Lecture 19: Spectral Clustering

Author

  • Akshay Krishnamurthy
Abstract

We are given $n$ data points $x_1, \ldots, x_n$ and some way to compute similarities between them; call $s_{i,j}$ the similarity between points $i$ and $j$. Assume that the similarity function is symmetric, so $s_{i,j} = s_{j,i}$. For the purposes of this lecture, let us consider the simplified clustering problem where we would like to split the dataset into two clusters. We would like to find a subset $S \subset [n]$ of size roughly $n/2$ such that the similarity between points within $S$ (and within $\bar{S}$) is high, while the similarity between the two clusters is low. We do not assume that we have any object features, just the similarities $s_{i,j}$.

Using these, we construct a similarity graph $G = (V, E)$ where the vertices are the $n$ objects, so $|V| = n$, and the edges are weighted by $s_{i,j}$. In graph-theoretic terminology, we want to partition the graph so that the edges between the two sides of the partition have low weight, while the edges within each side have high weight. Let us form the adjacency matrix of this graph, which we will always call $A \in \mathbb{R}^{n \times n}$. This matrix has $A_{i,j} = s_{i,j}$ and is unspecified on the diagonal (don't worry, we won't really use it at all). Define the degree matrix $D \in \mathbb{R}^{n \times n}$, a diagonal matrix with $D_{i,i} = \sum_{j=1}^{n} s_{i,j}$, the (weighted) degree of vertex $i$ in the graph.
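The abstract stops at the construction of $A$ and $D$; the standard next step (not stated in the excerpt, so this is an assumption about where the lecture goes) is spectral bisection with the unnormalized Laplacian $L = D - A$: compute the eigenvector of the second-smallest eigenvalue (the Fiedler vector) and split on its signs. A minimal NumPy sketch of that pipeline:

```python
import numpy as np

def spectral_bisection(S):
    """Split n points into two clusters from a symmetric similarity matrix S.

    Sketch of standard spectral bisection: build the unnormalized
    Laplacian L = D - A and split on the sign of the Fiedler vector.
    """
    A = np.array(S, dtype=float)
    np.fill_diagonal(A, 0.0)            # diagonal is unspecified; zero it out
    D = np.diag(A.sum(axis=1))          # degree matrix, D_ii = sum_j s_ij
    L = D - A                           # unnormalized graph Laplacian

    # Eigenvectors of a symmetric matrix, eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]             # second-smallest eigenvalue's vector

    return fiedler >= 0                 # boolean mask for one side of the cut

# Toy example: two dense blocks with weak cross-block similarity.
S = np.array([[0, 5, 5, 1, 0, 0],
              [5, 0, 5, 0, 1, 0],
              [5, 5, 0, 0, 0, 1],
              [1, 0, 0, 0, 5, 5],
              [0, 1, 0, 5, 0, 5],
              [0, 0, 1, 5, 5, 0]], dtype=float)
print(spectral_bisection(S))  # e.g. [ True  True  True False False False ]
```

Thresholding the Fiedler vector at its median, rather than at zero, would enforce the roughly-$n/2$ balance asked for above.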


Similar resources

EE 381V: Large Scale Learning, Spring 2013, Lecture 11 — February 19

The last two lectures focused on the algorithms and analysis of spectral clustering for Gaussian mixtures in the isotropic case, in which the $i$-th Gaussian has the distribution $X_i \sim \mathcal{N}(\mu_i, \sigma_i^2 I)$. Note that the covariance matrix is simply a multiple of the identity matrix, so each Gaussian is distributed spherically, as depicted in figure 11.1. Recall that in this case, we reduced dimension by ...
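The excerpt cuts off at the dimension-reduction step. In the standard analysis of spectral clustering for isotropic mixtures, the reduction projects the data onto the span of its top-$k$ singular vectors before clustering in the low-dimensional space. A minimal sketch of that projection; the mixture parameters below are hypothetical, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical isotropic mixture: k = 2 spherical Gaussians in d = 50 dims.
d, n_per = 50, 200
mu1, mu2 = np.zeros(d), np.full(d, 0.5)
X = np.vstack([rng.normal(mu1, 1.0, (n_per, d)),
               rng.normal(mu2, 1.0, (n_per, d))])

# Project onto the span of the top-k right singular vectors of X.
k = 2
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
X_proj = Xc @ Vt[:k].T              # n x k low-dimensional embedding

print(X_proj.shape)                 # (400, 2): points now live in k dimensions
```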


Lecture 7 — April 21, 7.1 Spectral Clustering

... is the degree matrix, and elucidate some of its key properties that make it suitable for clustering. Suppose we form a weighted graph $G$ with our data points being the vertices and the edge weights being specified by the weight matrix $W$. Let us first consider the special case when the graph $G$ has exactly $k$ components with vertex sets $A_1, \ldots, A_k$. Then the Laplacian $L_{rw}$ has the following prope...
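The excerpt truncates before stating the property. The standard fact is that for a graph with exactly $k$ connected components, the random-walk Laplacian $L_{rw} = I - D^{-1}W$ has eigenvalue 0 with multiplicity $k$, with eigenvectors constant on each component. A small numerical check of that claim, on a made-up block-diagonal weight matrix:

```python
import numpy as np

# Hypothetical weight matrix with exactly k = 2 components (block diagonal).
W = np.array([[0, 2, 2, 0, 0],
              [2, 0, 2, 0, 0],
              [2, 2, 0, 0, 0],
              [0, 0, 0, 0, 3],
              [0, 0, 0, 3, 0]], dtype=float)

D_inv = np.diag(1.0 / W.sum(axis=1))
L_rw = np.eye(len(W)) - D_inv @ W   # random-walk Laplacian (not symmetric)

eigvals = np.sort(np.linalg.eigvals(L_rw).real)
print(np.round(eigvals, 6))         # two (near-)zero eigenvalues -> k = 2
```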


Lecture 15: Spectral clustering, projective clustering

[Figure 15.1: $k = 3$ clusters with red points chosen as facilities.]

Consider a situation where we have $n$ point locations and we wish to place $k$ facilities among these points to provide some service. It is desirable to have these facilities close to the points they are serving, but the notion of "close" can have different interpretations. The $k$-means problem seeks to place $k$ facilities so as to mini...
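The sentence cuts off at the $k$-means objective: place $k$ facilities (centers) so as to minimize the sum of squared distances from each point to its nearest facility. A minimal sketch of Lloyd's algorithm for that objective; the function and parameter names here are my own:

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=50, seed=0):
    """Heuristically minimize the sum of squared distances to k centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # init at points
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each center to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

X = np.random.default_rng(1).normal(size=(300, 2))
centers, labels = lloyd_kmeans(X, k=3)
```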


Spectral Analysis of Data

In this lecture, we discuss some spectral techniques and their applications. The goal of spectral techniques is to study and deduce the characteristics of a matrix by looking at its spectrum (i.e., the eigenvalues of the matrix). Although spectral methods can be applied to any problem instance with a matrix representation, we will be focusing on graph-theoretic problems in this lecture. Over the past two...
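As one concrete, hypothetical instance of deducing characteristics of a matrix from its spectrum: for a simple undirected graph, the edge count and triangle count can be read off the eigenvalues of its adjacency matrix, since $\sum_i \lambda_i^2 = 2|E|$ and $\sum_i \lambda_i^3 = 6 \cdot \#\text{triangles}$.

```python
import numpy as np

# Adjacency matrix of a small undirected graph: one triangle plus a pendant edge.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

lam = np.linalg.eigvalsh(A)        # spectrum of the symmetric matrix A
print(round((lam ** 2).sum() / 2)) # 4 edges:    sum of lambda^2 = 2|E|
print(round((lam ** 3).sum() / 6)) # 1 triangle: sum of lambda^3 = 6 * count
```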


Lecture 7: 1 Spectral Clustering

Spectral clustering is a technique for segmenting data into non-overlapping subsets. It is used in many machine learning applications (e.g., von Luxburg 2007) and was introduced into computer vision by Shi and Malik (2000). The data is defined by a graph with an affinity, or similarity, measure between graph nodes. The computation – to segment the data – can be performed by linear algebra follo...
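The excerpt ends at the linear-algebra step. In Shi and Malik's normalized-cut formulation, the relaxed problem is the generalized eigensystem $(D - W)y = \lambda D y$, and the eigenvector for the second-smallest eigenvalue is thresholded to segment the data. A minimal sketch, assuming a dense affinity matrix W:

```python
import numpy as np
from scipy.linalg import eigh

def normalized_cut_split(W):
    """Two-way split via the Shi-Malik relaxation: (D - W) y = lambda * D y."""
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    # Generalized symmetric eigenproblem; eigenvalues in ascending order.
    eigvals, eigvecs = eigh(D - W, D)
    y = eigvecs[:, 1]               # second-smallest generalized eigenvector
    return y > np.median(y)         # median threshold gives a balanced cut
```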




Publication date: 2017